Optimizing the finite-state description of Estonian morphology
Abstract
Research on modeling Estonian morphology with finite-state devices has been influenced mostly by Koskenniemi (1983), Karttunen and Zaenen (1992), and Beesley and Karttunen (2000). We have used a lexical transducer combined with two-level rules as a general model for describing Estonian morphology. A novel aspect of our approach is that the rules are applied to both sides of the lexical transducer: to the lexical representation and to the lemma. The paper discusses criteria of optimality for a finite-state description of natural-language morphology, and the means of fulfilling these criteria, using the example of Estonian, a language with very rich and complex morphology. Other builders of finite-state morphological transducers may profit from the ideas proposed.
Similar resources
On Using the Two-level Model as the Basis of Morphological Analysis and Synthesis of Estonian
The paper deals with the problems of describing the Estonian morphological system in the two-level formalism developed by Kimmo Koskenniemi. The outlines of Estonian morphology are drawn. The basics of the two-level model are given and illustrated with real examples from the experimental Estonian two-level morphology (EETwoLM) composed by the author. A detailed example of step-by-step morpholo...
Finite-State Morphology of Estonian: Two-Levelness Extended
The paper concentrates on modeling Estonian morphology in the framework of the two-level morphology model. The result is a consistent description of Estonian morphology, which consists of a network of lexicons (the root lexicons cover the 2500 most frequent word roots) and two-level rules. The main rule set contains 45 rules, which describe various stem changes. The subset of rules dealing with stem ...
Experimental Two-Level Morphology of Estonian
The experimental two-level morphology of Estonian is under development at the University of Tartu. The language description, consisting of 45 two-level rules and over 200 lexicons, has been implemented and tested using the Xerox finite-state tools twolc and lexc. The root lexicons cover the 400 most frequent stems at the present stage of development. The software has been designed to update the lexicon ...
Parallel Forms in Estonian Finite State Morphology
Parallel forms are two or more synonymous forms that convey an identical set of morpho-syntactic categories in a paradigm cell of a word. They deserve attention from a theoretical linguistic, as well as from a computational point of view. How do humans know which form to choose, and how should this preference be modelled computationally? The paper gives an overview of parallel forms in Estonian...
Consonant Gradation in Estonian and Sámi: Two-Level Solution
Koskenniemi’s two-level morphology was the first practical general model in the history of computational linguistics for the analysis of morphologically complex languages. In this article we will reconsider one of the key innovations in Koskenniemi (1983), namely the treatment of consonant gradation in finite state transducers. We will look not at Finnish, but at two languages with a more exten...
Publication date: 2005